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ABSTRACT 

In this paper, we report new limits on 21cm emission from cosmic reionization based on a 135 
day observing campaign with a 64-element deployment of the Donald C. Backer Precision Array for 
Probing the Epoch of Reionization (PAPER) in South Africa. This work extends the work presented 
in Parsons et al. (2014) with more collecting area, a longer observing period, improved redundancy- 
based calibration, improved fringe-rate filtering, and updated power-spectral analysis using optimal 
quadratic estimators. The result is a new 2 a upper limit on A 2 {k) of (22.4 mK) 2 in the range 0.15 < 
k < 0.5 h Mpc -1 at z = 8.4. This represents a three-fold improvement over the previous best upper 
limit. As we discuss in more depth in a forthcoming paper (Pober et al. 2015), this upper limit 
supports and extends previous evidence against extremely cold reionization scenarios. We conclude 
with a discussion of implications for future 21cm reionization experiments, including the newly funded 
Hydrogen Epoch of Reionization Array. 

Subject headings: 


1. INTRODUCTION 

The cosmic dawn of the universe, which begins with 
the birth of the first stars and ends approximately one 
billion years later with the full reionization of the inter- 
galactic medium (IGM), represents one of the last unex¬ 
plored phases in cosmic history. Studying the formation 
of the first galaxies and their influence on the primordial 
IGM during this period is among the highest priorities in 
modern astronomy. During our cosmic dawn, IGM char¬ 
acteristics depend on the matter density field, the mass 
and clustering of the first galaxies (Lidz et al. 2008), 
their ultraviolet luminosities (McQuinn et al. 2007), the 
abundance of X-ray sources and other sources of heat¬ 
ing (Pritchard & Loeb 2008; Mesinger et al. 2013), and 
higher-order cosmological effects like the relative veloc¬ 
ities of baryons and dark matter (McQuinn & O’Leary 
2012; Visbal et al. 2012). 

Recent measurements have pinned down the bright end 
of the galaxy luminosity function at z < 8 (Bouwens 
et al. 2010; Schenker et al. 2013) and have detected a 
few sources at even greater distances (Ellis et al. 2013; 
Oesch et al. 2013). In parallel, a number of indirect 
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techniques have constrained the evolution of the neutral 
fraction with redshift. Examples include integral con¬ 
straints on reionization from the optical depth of Thom¬ 
son scattering to the CMB (Planck Collaboration et al. 
2014, 2015), large-scale CMB polarization anisotropies 
(Page et al. 2007), and secondary temperature fluctu¬ 
ations generated by the kinetic Sunyaev-Zel’dovich ef¬ 
fect (Mesinger et al. 2012; Zahn et al. 2012; Battaglia 
et al. 2013; Park et al. 2013; George et al. 2014). Other 
probes of the tail end of reionization include observa¬ 
tions of resonant scattering of Lya by the neutral IGM 
toward distant quasars (the ‘Gunn-Peterson’ effect) (Fan 
et al. 2006), the demographics of Ly<a emitting galax¬ 
ies (Schenker et al. 2013; Treu et al. 2013; Faisst et al. 
2014), and the Ly<a absorption profile toward very dis¬ 
tant quasars (Bolton et al. 2011; Bosman & Becker 2015). 
As it stands, the known population of galaxies falls well 
short of the requirements for reionizing the universe at 
redshifts compatible with CMB optical depth measure¬ 
ments (Robertson et al. 2013, 2015), driving us to deeper 
observations with, e.g., JWST and ALMA, to reveal the 
fainter end of the luminosity function. 

Complementing these probes of our cosmic dawn are 
experiments targeting the 21cm “spin-flip” transition of 
neutral hydrogen at high redshifts. This signal has been 
recognized as a potentially powerful probe of the cos¬ 
mic dawn (Furlanetto et al. 2006; Morales & Wyithe 
2010; Pritchard & Loeb 2012) that can reveal large-scale 
fluctuations in the ionization state and temperature of 
the IGM, opening a unique window into the complex 
astrophysical interplay between the first luminous struc¬ 
tures and their surroundings. Cosmological redshifting 
maps each observed frequency with a particular emis¬ 
sion time (or distance), enabling 21cm experiments to 
eventually reconstruct three-dimensional pictures of the 
time-evolution of large scale structure in the universe. 
While such maps can potentially probe nearly the entire 
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observable universe (Mao et al. 2008), in the near term, 
21cm cosmology experiments are focusing on statistical 
measures of the signal. 

There are two complementary experimental ap¬ 
proaches to accessing 21 cm emission from our cosmic 
dawn. So-called “global” experiments such as EDGES 
(Bowman & Rogers 2010), the LWA (Ellingson et al. 

2013) , LEDA (Greenhill & Bernardi 2012; Bernardi et al. 
2015), DARE (Burns et al. 2012), SciHi (Voytek et al. 

2014) , BigHorns (Sokolowski et al. 2015), and SARAS 
(Patra et al. 2015) seek to measure the mean brightness 
temperature of 21cm relative to the CMB background. 
These experiments typically rely on auto-correlations 
from a small number of dipole elements to access the 
sky-aver aged 21 cm signal, although recent work is show¬ 
ing that interferometric cross-correlations may also be 
used to access the signal (Vedantham et al. 2015; Presley 
et al. 2015). In contrast, experiments targeting statisti¬ 
cal power-spectral measurements of the 21 cm signal em¬ 
ploy larger interferometers. Examples of such interferom¬ 
eters targeting the reionization signal include the GMRT 
(Paciga et al. 2013), LOFAR (van Haarlem et al. 2013), 
the MWA (Tingay et al. 2013), the 21CMA (Peterson 
et al. 2004; Wu 2009), and the Donald C. Backer Preci¬ 
sion Array for Probe the Epoch of Reionization (PAPER; 
Parsons et al. 2010). 

PAPER is unique for being a dedicated instrument 
with the flexibility to explore non-traditional experimen¬ 
tal approaches, and is converging on a self-consistent ap¬ 
proach to achieving both the level of foreground removal 
and the sensitivity that are required to detect the 21cm 
reionization signal. This approach focuses on spectral 
smoothness as the primary discriminant between fore¬ 
ground emission and the 21cm reionization signal and ap¬ 
plies an understanding of interferometric responses in the 
delay domain to identify bounds on instrumental chro- 
maticity (Parsons et al. 2012b, hereafter P12b). This 
type of “delay-spectrum” analysis permits data from 
each interferometric baseline to be analyzed separately 
without requiring synthesis imaging for foreground re¬ 
moval. As a result, PAPER has been able to adopt 
new antenna configurations that are densely packed and 
highly redundant. These configurations are poorly suited 
for synthesis imaging but deliver a substantial sensitivity 
boost for power-spectral measurements that are not yet 
limited by cosmic variance (Parsons et al. 2012a, here¬ 
after P12a). Moreover, they are particularly suited for 
redundancy-based calibration (Wieringa 1992; Liu et al. 
2010; Zheng et al. 2014), on which PAPER now relies 
to solve for the majority of the internal instrumental de¬ 
grees of freedom (dof). The efficacy of this approach was 
demonstrated with data from a 32-antenna deployment 
of PAPER, which achieved an upper limit on the 21 cm 
power spectrum A 2 (k) < (41 mK) 2 at k = 0.27 h Mpc -1 
(Parsons et al. 2014, hereafter P14). That upper limit 
improved over previous limits by orders of magnitude, 
showing that the early universe was heated from adia¬ 
batic cooling, presumably by emission from high-mass 
X-ray binaries or mini-quasars. 

In this paper, we improve on this previous result using 
a larger 64-element deployment of PAPER and a longer 
observing period, along with better redundant calibra¬ 
tion, an improved fringe-rate filtering technique, and an 



Fig. 1.— Antenna position within the PAPER-64 array. This 
analysis only makes use of east-west baselines between adjacent 
columns that have row separations of zero (black; e.g. 49-41, 41- 
47, 10-3, ...) one in the northward direction (orange; e.g. 10-41, 
3-47, 9-3, ...) or one in the southward direction (blue; e.g. 49- 
3, 41-25, 10-58, ...). Because of their high levels of redundancy, 
these baselines constitute the bulk of the array’s sensitivity for 
power spectrum analysis. 

updated power-spectrum estimation pipeline. The result 
is an upper limit on A 2 {k) of (22.4 mK) 2 in the range 
0.15 < k < 0.5 h Mpc -1 at z = 8.4. This result places 
constraints on the spin temperature of the IGM, and as 
is shown in a forthcoming paper, Pober et al. (2015), 
this supports and extends previous evidence against ex¬ 
tremely cold reionization scenarios. In Section 2 we de¬ 
scribe the observations used in this analysis. In Sections 
3 and 4, we discuss the calibration and the stability of 
the PAPER instrument. We then move on to a discus¬ 
sion of our power-spectrum analysis pipeline in Section 
5. We present our results in Section 6 along with new 
constraints on the 21cm power spectrum. We discuss 
these results in Section 7 and conclude in Section 8. 


2. OBSERVATIONS 

We base our analysis on drift-scan observations with 64 
dual-polarization PAPER antennas (hereafter, “PAPER- 
64”) deployed at the Square Kilometre Array South 
Africa (SKA-SA) reserve in the Karoo desert in South 
Africa (30:43:17.5° S, 21:25:41.8° E). Each PAPER el¬ 
ement features a crossed-dipole design measuring two 
linear (X,Y) polarizations. The design of the PAPER 
element, which features spectrally and spatially smooth 
responses down to the horizon with a FWHM of 60°, 
is summarized in Parsons et al. (2010) and Pober et al. 
(2012). For this analysis, we use only the XX and YY 
polarization cross-products. 

As shown in Figure 1, PAPER-64 employs a highly 
redundant antenna layout where multiple baselines mea¬ 
sure the same Fourier mode on the sky (PI2a; P14). We 
rely on all 2016 baselines for calibration, but only use a 
subset of the baselines for the power spectrum analysis. 
This subset consists of three types of baselines: the 30- 
m strictly east-west baselines between adjacent columns 
(e.g. 49-41, black in Figure 1; hereafter referred to as 
fiducial baselines ), 30-m east-west baselines whose east¬ 
ern element is staggered one row up (e.g. 10-41, orange 
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Fig. 2.— The Global Sky Model (de Oliveira-Cost a et al. 2008), illustrating foregrounds to the 21cm cosmological signal, with contours 
indicating beam-weighted observing time (relative to peak) for the PAPER observations described in Section 2. The map is centered at 
6:00 hours in RA. 


in Figure 1), and those whose eastern element is one 
row down (e.g. 49-3, blue in Figure 1). These baseline 
groups consist of 56, 49, and 49 baselines, respectively. 
We define a redundant group of baselines as being the 
set of baselines that have the same grid spacing; base¬ 
lines in each of the three redundant groups described 
above are instantaneously redundant and therefore mea¬ 
sure the same Fourier modes on the sky. Thus, within a 
redundant group, measurements from baselines may be 
coherently added to build power-spectrum sensitivity as 
N rather than y/N, where N is the number of baselines 
added. 

PAPER-64 conducted nighttime observations over a 
135 day period from 2012 November 8 (JD 2456240) to 
2013 March 23 (JD 2456375). Since solar time drifts with 
respect to local sidereal time (LST), this observing cam¬ 
paign yielded more samples of certain LSTs (and hence, 
sky positions) than others. For the power spectrum anal¬ 
ysis, we use observations between 0:00 and 8:30 hours 
LST. This range corresponds to a “cold patch” of sky 
away from the galactic center where galactic synchrotron 
power is minimal, but also accounts for the weighting of 
coverage in LST. Figure 2 shows our observing field with 
the contours labeling the beam weighted observing time 
relative to the peak, directly over head the array. 

The PAPER-64 correlator processes a 100-200 MHz 
bandwidth, first channelizing the band into 1024 chan¬ 
nels of width 97.6 kHz, and then cross multiplying every 
antenna and polarization with one another for a total of 
8256 cross products, including auto correlations. Follow¬ 
ing the architecture in Parsons et al. (2008), this corre¬ 
lator is based on CASPER 16 open-source hardware and 
signal processing libraries (Parsons et al. 2006). Sixteen 
ROACH boards each hosting eight 8-bit analog-to-digital 
converters digitize and channelize antenna inputs. New 

16 http://casper.berkeley.edu 


to this correlator relative to previous PAPER correlators 
(Parsons et al. 2010), the cross multiplication engine is 
implemented on eight servers each receiving channelized 
data over two 10 Gb Ethernet links. Each server hosts 
two NVIDIA GeForce 580 GPUs running the open-source 
cross-correlation code developed by Clark et al. (2013). 
Visibilities are integrated for 10.7 s on the GPUs before 
being written to disk. All polarization cross-products are 
saved, although the work presented here only made use 
of the XX and YY polarization products. 

3. CALIBRATION 

Foreground contamination and signal sensitivity repre¬ 
sent the two major concerns for 21 cm experiments tar¬ 
geting power spectrum measurements. Sources of fore¬ 
grounds include galactic synchrotron radiation, super¬ 
nova remnants, and extragalactic radio sources. In the 
low-frequency radio band (50-200 MHz) where 21cm 
reionization experiments operate, emission from these 
foregrounds is brighter than the predicted reionization 
signal by several orders of magnitude (Santos et al. 
2005; Ali et al. 2008; de Oliveira-Costa et al. 2008; Jelic 
et al. 2008; Bernardi et al. 2009, 2010; Ghosh et al. 
2011). However, the brightest foregrounds are spectrally 
smooth, and this provides an important hook for their 
isolation and removal (Liu et al. 2009a; Petrovic & Oh 
2011; Liu & Tegmark 2012). Unfortunately, interferom¬ 
eters, which are inherently chromatic instruments, inter¬ 
act with spectrally smooth foregrounds to produce un¬ 
smooth features that imitate line of sight Fourier modes 
over cosmological volumes (P12b; TMorales et al. 2006; 
Bowman et al. 2009a). One approach to solving this 
problem involves an ambitious calibration and model¬ 
ing approach to spatially localize and remove foreground 
contaminants (Liu et al. 2009b; Bowman et al. 2009b; 
Harker et al. 2009; Sullivan et al. 2012; Chapman et al. 
2013). Perhaps the most impressive example of this ap¬ 
proach is being undertaken by LOFAR, where dynamic 
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Fig. 3. — The stages of power-spectrum analysis. Black lines indicate data flow; red lines indicate Monte Carlo simulations used to 
measure signal loss. Yellow boxes indicate stages that by construction have negligible signal loss. Signal loss in other stages is tabluted in 
Table 1. 


ranges of 4.7 orders of magnitude have been achieved 
in synthesis images (Yatawatta et al. 2013), although 
it is expected that additional suppression of smooth- 
spectrum foreground emission will be necessary (Chap¬ 
man et al. 2013). 

The analysis for this paper employs a contrasting ap¬ 
proach based on the fact that the chromaticity of an in¬ 
terferometer is fundamentally related to the length of an 
interferometric baseline. This relationship, known collo¬ 
quially as “the wedge,” was derived analytically (PI2b; 
Vedantham et al. 2012; Thyagarajan et al. 2013; Liu et al. 
2014a,b), and has been confirmed in simulations (Datta 
et al. 2010; Hazelton et al. 2013) and observationally 
(Pober et al. 2013; Dillon et al. 2014). As described 
in PI2b, the wedge is the result of the delay between 
when a wavefront originating from foreground emission 
arrives at the two antennas in a baseline. The fact that 
this delay is bounded by the light-crossing time between 
two antennas (which we call the “horizon limit” since 
such a wavefront would have to originate from the hori¬ 
zon) places a fundamental bound on the chromaticity of 
an interferometric baseline. So far, PAPER has had the 
most success in exploiting this bound (P14; Jacobs et al. 
2015). In this analysis, we continue to use the proper¬ 
ties of the wedge in order to isolate and remove smooth 
spectrum foregrounds. 

As illustrated in Figure 3, our analysis pipeline begins 
by running a compression algorithm to reduce the vol¬ 
ume of our raw data by a factor of 70. As described in 
Appendix A of P14, this is achieved by first performing 
statistical flagging to remove radio frequency interference 
(RFI) at the 6<r level, applying low-pass delay and fringe- 
rate filters that limit signal variation to delay scales of 
\t\ < 1/is and fringe-rate scales of / < 23 mHz, and then 
decimating to critical Nyquist sampling rates of 493 kHz 


along the frequency axis and 42.9 s along the time axis. 
We remind the reader that while information is lost in 
this compression, these sampling scales preserve emission 
between —0.5 < fey < 0.5 h Mpc -1 that rotates with the 
sky, making this an essentially lossless compression for 
measurements of the 21 cm reionization signal in these 
ranges. 

After compression, we calibrate in two stages, as de¬ 
scribed in more detail below. The first stage (Section 
3.1) uses instantaneous redundancy to solve for the ma¬ 
jority of the per-antenna internal dof in the array. In the 
second stage (Section 3.2), standard self-calibration is 
used to solve for a smaller number of absolute phase and 
gain parameters that cannot be solved by redundancy 
alone. After suppressing foregrounds with a wide-band 
delay filter (Section 3.3) and additional RFI flagging and 
crosstalk removal, we average the data in LST (Section 
3.4) and apply a fringe-rate filter (Section 3.5) to com¬ 
bine time-domain data. Finally, we use an OQE (Section 
5) to make our estimate of the 21 cm power spectrum. 

3.1. Relative Calibration 

Redundant calibration has gained attention recently 
as a particularly powerful way to solve for internal dof in 
radio interferometric measurements without simultane¬ 
ously having to solve for the distribution of sky bright¬ 
ness (Wieringa 1992; Liu et al. 2010; Noorishad et al. 
2012; Marthi & Chengalur 2014; Zheng et al. 2014; P14). 
The grid-based configuration of PAPER antennas allows 
a large number of antenna calibration parameters to be 
solved for on the basis of redundancy (P14; PI2a; Zheng 
et al. 2014). Multiple baselines of the same length and 
orientation measure the same sky signal. Differences be¬ 
tween redundant baselines result from differences in the 
signal chain, including amplitude and phase effects at¬ 
tributable to antennas, cables, and receivers. Redundant 
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calibration only constrains the relative complex gains be¬ 
tween antennas and is independent of the sky. Since re¬ 
dundant calibration preserves signals common to all re¬ 
dundant baselines, this type of calibration does not result 
in signal loss. 

In practice, redundant calibration often takes on two 
flavors: log calibration (LOGCAL) and linear calibration 
(LINCAL) (Liu et al. 2010; Zheng et al. 2014). LOGCAL 
uses logarithms applied to visibilities, 

Vij = 9i9jVi-j + <f, (1) 

where g denotes the complex gain of antennas indexed by 
i and j, and y represents the “true” visibility measured 
by the baseline, to give a linearized system of equations 


log Vij = log g* + log gj + log , (2) 


In solving for per-antenna gain parameters with a num¬ 
ber of measurements that scales quadratically with an¬ 
tenna number, redundancy gives an over-constrained sys¬ 
tem of equations that can be solved using traditional lin¬ 
ear algebra techniques. While LOGCAL is useful for ar¬ 
riving at a coarse solution from initial estimates that are 
far from the true value, LOGCAL has the shortcoming 
of being a biased by the asymmetric behavior of additive 
noise in the logarithm (Liu et al. 2010). 

LINCAL, on the other hand, uses a Taylor expansion 
of the visibility around initial estimates of the gains and 
visibilities, 


0* 0 0 . 1* 0 0 I o* 1 0 I 0* 0 1 /o\ 

Vij=g\ gjVi-j+gi g^Vi-j+gi g^Vi-i, (3) 


where 0 denotes initial estimates and 1 represents the 
perturbation to the original estimate and is the solutions 
we fit for. Using initial estimates taken from LOGCAL, 
LINCAL constructs an unbiased estimator. 

Redundant calibration was performed using OMNI- 
CAL 17 — an open-source redundant calibration package 
that is relatively instrument agnostic (Zheng et al. 2014). 
This package implements both LOGCAL and LINCAL, 
solving for a complex gain solution per antenna, fre¬ 
quency, and integration. The solutions are then applied 
to visibilities and the results are shown in Figure 4. 

In addition to solving for gain solutions, OMNICAL 
also characterizes the quality of the calibration parame¬ 
ters by calculating the y 2 for every integration. As de¬ 
fined in Zheng et al. (2014), 


2/dof. The calculated y 2 /dof for every frequency and 
integration of a fiducial day of observation in this season 
and for the fiducial power spectrum baselines is shown 
in Figure 5, demonstrating the stability of the PAPER 
instrument. 

We measure a mean y 2 /dof of 1.9. This indicates 
that the redundant calibration solutions, while a sub¬ 
stantial improvement over the previous PAPER-32 cali¬ 
bration (P14), do not quite result in residuals that are 
thermal noise dominated. Possible sources of this excess 
include instrumental crosstalk and poorly performing sig¬ 
nal chains. While the latter will be down-weighted by 
the inverse of the estimated signal covariance described 
in Section 5, crosstalk is a defect in the data that must 
be addressed. Crosstalk caused by the cross-coupling of 
signals between antennas reveals itself as a static com¬ 
plex bias to a visibility that varies on timescales much 
longer than typical fringe rates. This effect skews the 
distribution of the y 2 of the residuals away from 1. To 
minimize crosstalk, we first use OMNICAL to solve for 
antenna-dependent gains, and then average the residual 
deviations from redundancy over 10 minute windows be¬ 
fore subtracting the average from the original visibilities. 
This crosstalk removal preserves signals common to re¬ 
dundant baseline groups (such as the 21 cm signal). Un¬ 
fortunately, it also preserves a term that is the average 
of the crosstalk of all baselines in the redundant group. 
This residual crosstalk is removed by a fringe-rate filter 
later in the analysis. 

3.2. Absolute Calibration 

After solving for the relative complex gains of the an¬ 
tennas using redundant calibration, an overall phase and 
gain calibration remains unknown. We use the standard 
self calibration method for radio interferometers to solve 
for the absolute phase calibration. We used Pictor A, 
Fornax A, and the Crab Nebula to fit for the overall 
phase solutions. Figure 6 shows an image of the field 
with Pictor A (5:19:49.70, -45:46:45.0) and Fornax A 
(3:22:41.70,-37:12:30.0). 

We then set our over all flux scale by using Pictor A 
as our calibrator source with source spectra derived in 
Jacobs et al. (2013), 

Sv = Sl50 X (l50 mHz) ’ ^ 


^2 = '— ij ~ I 


4 


( 4 ) 


where a 2 is the noise in the visibilities. The y 2 measures 
sum of the deviation of measured visibilities to that of 
the best fit model derived from the LINCAL relative to a 
noise model, and gives us a tool to use in order to check 
the quality of our data. The number of dof, as defined 
in Zheng et al. 2014, is given by 


dof = N r 


measurements 


-N 


parameters 


( 5 ) 


2Ab aS elines 2(A antennas T A un iq ue baselines )•> 


and is effectively the number of visibilities for which y 2 
is calculated. If the data are noise-dominated, y 2 /dof 
is drawn from a y 2 distribution with fi = 1 and a 2 = 


where 6150 = 381.88 Jy =b 5.36 and a = —0.76 db 0.01, 
with lcr error bars. 

To derive the source spectrum from our measurements, 
we use data that have been LST-averaged prior to the 
wide-band delay filter described in Section 3.3, for the 
hour before and after the transit of Pictor A. We image a 
30° x 30° field of view for every frequency channel for each 
10 minute snapshot and apply uniform weights to the 
gridded visibilities. We account for the required three- 
dimensional Fourier transform in wide field imaging by 
using the w-stacking algorithm implemented in WSclean 
(Offringa et al. 2014) although we note that the standard 
w-projection algorithm implemented in CASA 18 gives 
similar performances as the PAPER array is essentially 
instantaneously coplanar. A source spectrum is derived 
for each snapshot by fitting a two-dimensional Gaussian 


17 https://github.com/jeffzhen/omnical 


18 http://casa.nrao.edu 
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Before OMNICAL 



Real [kjy] 


After OMNICAL 



Real [kjy] 


Fig. 4. — PAPER visibilities plotted in the complex plane before (left) and after (right) the application of the improved redundancy- 
based calibration with OMNICAL (Zheng et al. 2014). All baselines in the array measured at 159 MHz for a single time integration are 
plotted. Instantaneously redundant baselines are assigned the same symbol/color. The tighter clustering of redundant measurements with 
OMNICAL indicates improved calibration. 
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Fig. 5. — Log of % 2 per degree of freedom of all baseline residuals 
after the application of OMNICAL. The plot comprises a observa¬ 
tions over one day, with a frequency resolution of 493 kHz and a 
time resolution of 42.9 s. 


to Pictor A by using the PyBDSM 19 source extractor. 
Spectra are optimally averaged together by weighting 
them with the primary beam model evaluated in the di¬ 
rection of Pictor A. To fit our bandpass, we divide the 
model spectrum by the measured one and fit a 9th or¬ 
der polynomial over the 120-170 MHz frequency range. 
Figure 7 shows the calibrated Pictor A spectrum and the 
model spectrum from Jacobs et al. (2013). Also plotted 
are the la error bars derived from the PyBDSM source 
extractor and averaged over the multiple snapshots used 
after being weighted by the beam-squared. 

Fitting a polynomial to the bandpass has the potential 
for signal loss which would include suppressing modes 
that may contain the cosmological signal. In order to 
quantify the signal loss associated with fitting a ninth 
degree polynomial to the bandpass, we run a Monte Carlo 
simulation of the effect the bandpass has on a model 21- 
cm reionization signal. We construct a model baseline 
visibility as a Gaussian random signal multiplied by the 
derived bandpass for every independent mode measured. 

19 http://www.lofar.org/wiki/doku.php?id=public:user.software: 



Fig. 6.— PAPER-64 image of a field including Pictor A and For¬ 
nax A, with white circles indicating catalog positions (Jacobs et al. 
2011). Image was synthesized with two hours of visibilities while 
Pictor A was in transit and 53 MHz of instantaneous bandwidth 
from 120 to 173 MHz. Image quality is limited by the redundant 
configuration of the array (e.g. grating lobes as a result of periodic 
antenna spacing, elongated lobes arising from poor uv-coverage in 
the north-south direction). Nonetheless, this image demonstrates 
accurate phase calibration over a wide field of view. 


We calculate the total number of independent modes by 
counting the number of independent uv-modes sampled 
for the different baseline types over the two hour time 
interval used to measure the bandpass. We average each 
mode together and fit a 9th degree polynomial. Using 
this as our measured bandpass for this simulated signal, 
we finally compare the power spectrum from the output 
of the simulated signal to the input power spectrum as a 
function fo k- mode. We find that between —0.06 < k < 
0.06, the width of our wideband delay filter described 
below, the signal loss is less than 3% and at the mode 
pybdright outside the above limit is2x!0 _r %. We apply the 
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Fig. 7.— Measured spectrum of Pictor A in Stokes I (blue) rel¬ 
ative to its catalog value (black; Jacobs et al. 2013). Flux density 
measurements are extracted from images of Pictor A, made inde¬ 
pendently for each frequency channel in 10 minutes snapshots as 
Pictor transits between hour angles of -1:49 and 1:10. Each mea¬ 
surement is then divided by the PAPER beam model and averaged 
to obtain the measured spectrum, which serves to characterize the 
flux scale of the PAPER-64 observations. Error bars indicate 68% 
confidence intervals, derived from the Gaussian fits in the source 
extractor used to measure the flux density in PyBDSM, combined 
from all snapshots. 

latter correction factor for all modes outside the width 
of the delay filter to the final power spectrum. 

3.3. Wideband Delay Filtering 

Before implementing our foreground removal tech¬ 
niques, we combine the two linear polarizations for an 
estimate of Stokes I as per Moore et al. (2013). Namely, 
Stokes I can be estimated as 

Vi = ^(V xx + V YY ), (7) 

where Vxx and Vyy are the visibilities of the two linear 
polarizations measured by the interferometer. There are 
some important caveats to the estimate of Stokes I pro¬ 
vided by Equation (7). One important caveat is that it 
neglects the beam asymmetry between the two linear po¬ 
larization states. This mismatch can cause polarization 
leakage from Stokes Q into Stokes I, thus contaminating 
our measurement of the power spectrum with any polar¬ 
ized emission from the sky. This effect for PAPER, as 
shown in Moore et al. (2013), leaks 4% of Q in to I in 
amplitude (2.2 x 10 -3 in the respective power spectra). 
We take the conservative approach and do not correct 
for this effect, noting that the leakage of Q in to I will 
result in positive power, increasing our limits. 

Foreground removal techniques discussed in the liter¬ 
ature include spectral polynomial fitting (Wang et al. 
2006; Bowman et al. 2009a; Liu et al. 2009a), prin¬ 
cipal component analysis (Paciga et al. 2011; Liu & 
Tegmark 2011; Paciga et al. 2013; Masui et al. 2013), 
non-parametric subtractions (Harker et al. 2009; Chap¬ 
man et al. 2013), and inverse covariance weighting (Liu 
& Tegmark 2011; Dillon et al. 2013, 2014; Liu et al. 
2014a,b), Fourier-mode filtering Petrovic & Oh (2011), 
and per-baseline delay filtering described in P12b. This 
delay-spectrum filtering technique is well-suited to the 
maximum redundancy PAPER configuration which is 


not optimized for the other approaches where high fi¬ 
delity imaging is a prerequisite. The delay-spectrum fore¬ 
ground filtering method is described in detail by P14; its 
application is unchanged here. In summary; we Fourier 
transform each baseline spectrum into the delay domain 

V T = J W v A u I u e~ 2,riT » ■ e 2nirv du (8) 
= W T * A r * I T * S(r g — r), 

where A v is the frequency dependent antenna response, 
W v is a sampling function that includes RFI flagging 
and a Blackman-Harris tapering function that minimizes 
delay-domain scattering from RFI flagging, and I v is the 
source spectrum. In the delay domain, a point source ap¬ 
pears as a ^-function at delay r^, convolved by the Fourier 
transforms of the source spectrum, the antenna response, 
and the sampling function. We note that the antenna 
response effectively determines a finite bandpass, which 
imposes a lower bound of 1/B « 10 ns on the width of 
any delay-domain convolving kernel. As per Parsons & 
Backer (2009) and P14, we deconvolve the kernel result¬ 
ing from W (r) using an iterative CLEAN-like procedure 
(Hogbom 1974) restricting CLEAN components to fall 
within the horizon plus a 15-ns buffer that includes the 
bulk of the kernels convolving the ^-function in Equa¬ 
tion (8). To remove the smooth spectrum foreground 
emission we subtract the CLEAN components from the 
original visibility. 

Applying the delay filter to fiducial baselines used in 
the power spectrum analysis, foregrounds are suppressed 
by rsj 4 orders of magnitude in power, or —40 dB of fore¬ 
ground suppression, as seen in Figure 8. As discussed 
in P14, there is a small amount of signal loss associated 
with this filter. For the baselines and filter parameters 
used, the loss was found to be 4.8% for the first mode 
outside of the horizon, 1.3% for the next mode out, and 
less than 0.0015% for the higher modes. 

3.4. Binning in LST 

After the wideband delay filter, we remove a second 
layer of RFI which was overshadowed by the foreground 
signal. RFI are excised with a filter which flags values 
3cr above the median using a variance calculated in a 
localized time and frequency window. 

We then average the entire season in LST with 43- 
s bin widths, matching the cadence of the compressed 
data. The full season was 135 days long; of these, 124 
days were included in the average. We make two separate 
LST-binned data sets, averaging every other Julian day 
together to obtain an “even” and “odd” dataset. The use 
of these two data sets allows us to construct an unbiased 
power spectrum estimate. 

Sporadic RFI events result in measurements that, in 
any individual LST bin, deviate from the Gaussian dis¬ 
tribution characteristic of thermal noise. To catch these 
events, we compute the median of a LST bin for each 
frequency and flag values 3cr above the median, before 
averaging. Since we are narrowing the distribution of vis¬ 
ibilities about the median, the measured thermal noise 
variance is not preserved under this filter. However, since 
the central value is preserved, the expectation value of 
the measured visibility in each LST bin is unchanged, 
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Fig. 8.— Visibilities measured by a fiducial baseline in the PAPER-64 array, averaged over 135 days of observation. From left to right, 
columns represent data that: (1) contain foregrounds prior to the application of a wideband delay filter or fringe-rate filtering, (2) are 
fringe-rate filtered but not delay filtered, (3) are delay filtered at 15 ns beyond the horizon limit but are not fringe-rate filtered, (4) are both 
delay and fringe-rate filtered, and (5) are delay and fringe-rate filtered and have been averaged over all redundant measurements of this 
visibility. The top row shows signal amplitude on a logarithmic scale; the bottom row illustrates signal phase. Dashed lines indicate the 
0:00-8:30 range in LST used for power spectrum analysis. The putative crosstalk is evident in the center panel as constant phase features 
which do not fringe as the sky. The two right panels show some residual signal in the phase structure which is present at low delay. Away 
from the edges of the observing band, over four orders of magnitude of foreground suppression is evident. 


and there is no associated signal loss for power spec¬ 
trum measurements. Moreover, because errors are es¬ 
timated empirically through bootstrapping (see Section 
5.4), the slight increase in measurement error associated 
with truncating the tails of the Gaussian distribution are 
naturally accounted for. 

3.5. Fringe-rate Filtering 

By averaging visibilities in time, we aim to maximize 
sensitivity by coherently combining repeated measure¬ 
ments of k -modes before squaring these measurements 
and averaging over independent k -modes to estimate the 
power spectrum amplitude. This is mathematically sim¬ 
ilar to the more traditional process of gridding in the uv 
plane, but applied to a single baseline. However, rather 
than applying a traditional box-car average, we can ap¬ 
ply a kernel — a so-called “fringe-rate” filter — that 
weights different temporal rates by the antenna beam 
corresponding to the parts of the sky moving at the same 
rate. 


For a given baseline and frequency, different parts of 
the sky exhibit different fringe-rates. Maximum fringe 
rates are found along the equatorial plane, where the ro¬ 
tation rate of the sky is highest, and zero fringe rates 
are found at the poles, where the sky does not rotate 
and hence sources do not move through the fringes of 
a baseline (Parsons & Backer 2009). Fringe rates are 
not constant as a function of latitude. Bins of constant 
fringe rate correspond to rings in R.A. and deck, where 
the east-west projection of a baseline projected toward 
a patch of the sky is constant. We use this fact in con¬ 
junction with the root-mean-squared beam response for 
each contour of constant fringe rate to construct a time 
average kernel or “fringe-rate filter.” 

As examined in Parsons et al. (2015), it is possible 
to tailor fringe-rate filters to optimally combine time- 
ordered data for power-spectrum analysis. Fringe-rate 
filters can be chosen that up-weight points of the sky 
where our instrument is more sensitive and down-weight 
those points farther down in the primary beam, which 
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are less sensitive. For white noise, all fringe-rate bins 
will contain the same amount of noise, but the amount 
of signal in each bin is determined by the primary beam 
response on the sky. By weighting fringe-rate bins by the 
rms of the beam response, we can get a net increase in 
sensitivity. 

Applying this filter effectively weights the data by an¬ 
other factor of the beam area, changing the effective pri¬ 
mary beam response 20 , A(l,m) (Parsons et al. 2015). By 
utilizing prior knowledge about the beam area, we are 
selectively down-weighting areas on the sky contributing 
little signal. This will result in a net improvement in sen¬ 
sitivity depending on the shape of the beam and the deck 
of the array. For PAPER, this filter roughly doubles the 
sensitivity of our measurements. 

Generally, a fringe-rate filter integrates visibilities in 
time. For a fringe-rate filter, /f r , the effective integra¬ 
tion time can be calculated by comparing the variance 
statistic before and after filtering: 

+ + f rCi \ 

^int,after ^int,before ~r o no •> (9) 

JtffM 

where tint, before is the integration time before filtering, 
cif denotes the noise variance in fringe rate space and 
the integral is taken over all possible fringe rates for a 
given baseline and frequency. As discussed in Parsons 
et al. (2015), the signal re-weighting associated with this 
fringe-rate filter can be interpreted as a modification to 
the shape of the primary beam. 

For the fiducial baseline at 151 MHz, the integration 
time, as given in equation (9), associated with an optimal 
fringe rate filter is 3430 s. The number of statistically 
independent samples on the sky decreases from 83 to 1 
sample per hour. As discussed in section 5.3, empirically 
estimating a covariance matrix with a small number of 
independent samples can lead to signal loss in the OQE. 
In order to counteract the signal loss, we degrade the 
optimal fringe-rate filter, as shown in Figure 9, to have 
an effective integration time of 1886 s, increasing the 
number of independent modes to 2 per hour. The fringe 
rate filter is now sub-optimal, but is still an improvement 
on the boxcar weighting as used in P14. As documented 
in Table 1, the correction factor for the associated signal 
loss of the filter we have chosen is 1.39. 

We implement the modified filter on a per baseline ba¬ 
sis by weighting the fringe-rate bins on the sky by the 
RMS of the beam at that same location. In order to ob¬ 
tain a smooth filter in the fringe-rate domain, we fit a 
Gaussian with a hyperbolic tangent tail to this filter. In 
addition, we multiply this response with another hyper¬ 
bolic tangent function that effectively zeros out fringe 
rates below 0.2 mHz. This removes the slowly varying 
signals that we model as crosstalk. We convolve the 
time-domain visibilities with the Fourier transform of the 
resulting fringe-rate filter, shown in Figure 9, to produce 
an averaged visibility. The effect on the data can be seen 
in Figure 8. 

4. INSTRUMENTAL PERFORMANCE 
4.1. Instrument Stability 

20 The angular area in Equation (24) will reflect the new angular 
area corresponding to the change in beam area. 



Fig. 9. — The optimal fringe-rate filter (orange) that and the 
degraded fringe-rate filter (blue) actually used in the analysis at 
151 MHz, normalized to peak at unity. 

In order to build sensitivity to the 21cm reionization 
signal, it is critical that PAPER be able to integrate co¬ 
herently measurements made with different baselines on 
different days. Figure 10 shows the visibility repeatabil¬ 
ity between baselines and nights as a function of LST. 
Specifically, we histogram the real part of the visibilities 
for all redundant fiducial baselines in a given LST bin for 
foreground contained data. We see that for a given LST 
bin, the spread in values over all the baselines is ^50 Jy 
which corresponds with our observed F S y S 500K. We get 
more samples per LST bin in the range of 2-10 hr due to 
our observing season, therefore the density of points in 
this LST region is greater, as shown by the color scale. 
This density plot shows that redundant baselines agree 
very well with one another; OMNICAL has leveled the 
antenna gains to within the noise. 

Delving in a little deeper, we also examine the stabil¬ 
ity in time for measurements in a particular LST bin. 
In order to quantify the stability in time we extract one 
channel for a given baseline for every observation day 
and LST bin. We then Fourier transform along the time 
direction for every LST bin and compute the power spec¬ 
trum. As shown in Figure 11, for time scales greater than 
one day, we see that signal variance drops by almost four 
orders of magnitude, with the exception of an excess on 
two-day timescales caused by the changing alignment of 
the 42.9 s integration timescale relative to a sidereal day. 
The implication of this measurement is that, after cali¬ 
bration, PAPER measurements are sufficiently stable to 
be integrated coherently over the entire length of a 135 
day observation. This implies day-to-day stability of bet¬ 
ter than 1%, contributing negligibly to the uncertainties 
in the data. 

4.2. System Temperature 

During the LST binning step, the variance of the visi¬ 
bilities that are averaged together for a given frequency 
and LST bin are recorded. Using these variances, we 
calculate the system temperature as a function of LST, 
averaging over each LST hour. 

T rms = T sys /V2Art, (10) 
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Fig. 10. — Histogram of the real component of all calibrated visibilities measured over 135 days with every redundant instance of the 
fiducial baseline at 150 MHz. Color scale indicates the number of samples falling in an LST/flux-density bin. This plot serves to illustrate 
the stability of the PAPER instrument and the precision of calibration. The temporal stability of a single LST bin over multiple days is 
shown in Figure 11. 



Fig. 11. — Power spectrum of 135 days of time-series data con¬ 
tributing to a single LST bin, illustrating the stability of measure¬ 
ments over the observing campaign. Relative to the average value, 
variation in the measured value across days (quantified by variance 
as a function of time period) is orders of magnitude lower. The ex¬ 
cess at two-day timescales is a beat frequency associated with the 
changing alignment of integration windows in the correlator with 
respect to sidereal time. 

where A v is the bandwidth, t is the integration time, 
and T rms is the RMS temperature, or the variance statis¬ 
tic described above. Figure 12 shows the results of this 
calculation. In this observing season, the system tem¬ 
perature drops just below previous estimates as in P14 
and Jacobs et al. (2015) of T sys = 560 K, at T sys = 500 K 
at 160 MHz. However, this estimate is more consistent 
with the results derived in (Moore et al. 2015), where 
T sys = 505 K at 164 MHz. The change in the system 
temperature can be attributed to the reduced range of 
LST used in the calculation. We note that at 7:00 LST, 
there is an increase in the system temperature due to the 
rising of the galactic plane as seen in Figure 2. 

When calculating the system temperature using the 
variance in the visibilities for a given LST and frequency, 
we take into account the fact that we flag 3cr outliers from 
the median. To calculate an effective correction factor to 
account for the filtering, we assume the visibilities follow 
a Gaussian distribution which would require a correction 
factor of 1.34 for the removal of data points that are 3 a 
above the median. In other words, we are accounting for 



Fig. 12.— System temperature, inferred from the variance of 
samples falling in an LST bin, averaged over one-hour intervals in 
LST. The measured value in the 150-160 MHz range is consistent 
with previous determinations of system temperature (Jacobs et al. 
2015; P14). 

the wings of the Gaussian that would contribute to the 
variance in the visibility. 

Previous estimates of the system temperature (P14; 
Jacobs et al. 2015) relied on differencing and averag¬ 
ing baselines, time samples, and/or frequency channels. 
The relative agreement between these various methods 
of estimating the system temperature provides a robust 
measure of the system temperature of the PAPER in¬ 
strument. Agreement between the instantaneous mea¬ 
surements of the system temperature, the LST repetition 
variance, and the predicted power spectrum noise level 
(see below) indicates a robustly stable system with no 
significant long term instability contributing appreciable 
noise. 

5. POWER SPECTRUM ANALYSIS 

In this section we first review the OQE formalism, fol¬ 
lowed by a walk-through of our particular applications of 
the OQE method to our data. Finally, we discuss the ef¬ 
fects of using an empirically estimated covariance matrix 
in our analysis. 

5.1. Review of OQEs 

We use the OQE method to estimate our power spec¬ 
trum as done in Liu & Tegmark (2011), Dillon et al. 
(2013), Liu et al. (2014a), Liu et al. (2014b), and Trott 
et al. (2012). Here we briefly review the OQE formal¬ 
ism with an emphasis on our application to data, which 
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draws strongly from the aforementioned works, but also 
relies on empirical techniques similar to those used in 
P14. The end goal of this analysis is to estimate the 
21cm power spectrum, P 2 i(k), defined such that 

<T b (k)T ft *(k')> = (27r) 3 <5 I5 (k - k')P 2 i(k), (11) 

where TJ,(k) is the spatial Fourier transform of the bright¬ 
ness temperature distribution on the sky, () denotes an 
ensemble average, and £ D is the Dirac delta function. 

In order to make an estimate of the power spectrum 
in the OQE formalism, one begins with a data vector x. 
This vector could, for example, consist of a list of bright¬ 
ness temperatures on the sky for an imaging-based data 
analysis, or (in our case) a list of measured visibilities. 
We form the intermediate quantity, 

q a = ^x f C~ 1 Q a C~ 1 x-b a , (12) 

which will be needed to form the OQE of our power spec¬ 
trum. Here, C = (xx^) is the true covariance matrix of 
the data vector x, Q a is the operator that takes visibil¬ 
ities into power spectrum k -space and bins into the ath 
bin, and b a is the bias to the estimate that needs to be 
subtracted off. In general, Q a represents a family of ma¬ 
trices, one for each k bin indexed by a. Each matrix is 
defined as Q a = J^p-, i.e., the derivative of the covari¬ 
ance matrix with respect to the band power p a . The 
bandpower p a can be intuitively thought of as the value 
of the power spectrum in the ath k bin. Therefore, Q a 
encodes the response of the data covariance matrix to 
the ath bin of the power spectrum. 

The bias term b a in Equation (12) will include contri¬ 
butions from both instrumental noise and residual fore¬ 
grounds. Their presence in the data is simply due to the 
fact that both contributions have positive power. One 
approach to dealing with these biases is to model them 
and to subtract them off, as is suggested by Equation 
(12). An alternate approach is to compute a cross-power 
spectrum between two data sets that are known to have 
the same sky signal but independent instrumental noise 
realizations. Labeling these two data sets as xi and x 2 
and computing 

Qa = ^x|C _1 Q q C _1 x 2 , (13) 

one arrives at a cross-power spectrum that by construc¬ 
tion has no noise bias. There is thus no need to explicitly 
model and subtract any noise bias, although any resid¬ 
ual foreground bias will remain, since it is a contribution 
that is sourced by signals on the sky, and therefore must 
exist in all our data sets. 

The set of q a s do not yet constitute a properly nor¬ 
malized estimate of the power spectrum (as evidenced, 
for example, by the extra factors of C -1 ). To normalize 
our results, we group the unnormalized bandpowers into 
a vector q and apply a matrix M (whose exact form we 
specify later), so that 

p = Mq (14) 

is a normalized estimate p of the true power spectrum 
p. We emphasize that the vector space that contains q 
and p is an “output” vector space over different fc-bins, 
which is separate from the “input” vector space of the 
measurements, in which x and C reside. 


To select an M matrix that properly normalizes the 
power spectrum, we must compute the window function 
matrix W for our estimator. The window matrix is de¬ 
fined such that the true bandpowers p and our estimates 
p of them are related by 

P = Wp, (15) 

so that each row gives the linear combination of the true 
power that is probed by our estimate. With a little al¬ 
gebra, one can show that 

W = MF, (16) 

where 

¥ a0 = ^ti(C‘*Q,,C 'Q,). (17) 

which we have suggestively denoted with the symbol F 
to highlight the fact that this turns out to be the Fisher 
information matrix of the bandpowers. In order to inter¬ 
pret each bandpower as the weighted average of the true 
bandpowers, we require each row of the window func¬ 
tion matrix to sum to unity. As long as M is chosen in 
such a way that W satisfies this criterion, the resulting 
bandpower estimates p will be properly normalized. 

Beyond the normalization criterion, a data analyst has 
some freedom over the precise form of M, which effec¬ 
tively also re-bins the bandpower estimates. One pop¬ 
ular choice is M = F -1 , which implies that W = I. 
Each window function is then a delta function, such that 
bandpowers do not contain leakage from other bins, and 
contain power from only that bin. However, the disad¬ 
vantage of this becomes apparent if one also computes 
the error bars on the bandpower estimates. The error 
bars are obtained by taking the square root of the diag¬ 
onal of the covariance matrix, which is defined as 

£ = Cov(p) = (pp f ) - (p)(p) f . (18) 

Since p = Mq, it is easily shown that 

£ = MFM f . (19) 

The choice of M = F -1 tends to give rather large error 
bars. At the other extreme, picking M Q ^ oc S a p/F aa 
(with the proportionality constant fixed by our normal¬ 
ization criterion) leads to the smallest possible error bars 
(Tegmark 1997), at the expense of broader window func¬ 
tions. In our application of OQEs in the following sec¬ 
tions, we will pick an intermediate choice for M, one that 
is carefully tailored to avoid the leakage of foreground 
power from low k modes to high k modes. 

5.2. Application of OQE 

Here we describe the specifics of our application of the 
OQE formalism to measure the power spectrum. Doing 
so requires defining various quantities such as x, C, Q a 
for our analysis pipeline. 

First, we consider x, which represents the data in our 
experiment. Our data set consists of visibilities as a func¬ 
tion of frequency and time for each baseline in the array. 
In our analysis, we group the baselines into three groups 
of redundant baselines (described in Section 2), in the 
sense that within each group there are multiple copies 
of the same baseline. In the description that follows, 
we first estimate the power spectrum separately for each 
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group. Power spectrum estimates obtained from the dif¬ 
ferent redundant groups are then combined in a set of 
averaging and bootstrapping steps described in Section 
5.4. Note that because our data have been fringe-rate 
filtered in the manner described in Section 3.5, we may 
reap all the benefits of coherently integrating in time sim¬ 
ply by estimating the power spectrum for every instant 
in the LST-binned data before averaging over the time- 
steps within the LST-binned day (Parsons et al. 2015). 

For the next portion of our discussion, consider only 
the data within a single redundant group. Within each 
group there are not only multiple identical copies of the 
same baseline, but in addition (as discussed in Section 
3.3), our pipeline also constructs two LST-binned data 
sets, one from binning all even-numbered days in our ob¬ 
servations, and the other from all odd-numbered days. 
Thus, we have not a single data vector, but a whole fam¬ 
ily of them, indexed by baseline (i) and odd versus even 
days (r). Separating the data out into independent sub¬ 
groups allows one to estimate cross-power spectra rather 
than auto-power spectra in order to avoid the noise bias, 
as discussed in the previous section. The data vectors 
take the form 


V) = 


V ri (u 2 ,t) 

v ; 


( 20 ) 


where V ri (i /, £) is the visibility at frequency v at time t. 
Each data vector is 20 elements long, being comprised of 
20 channels of a visibility spectrum spanning 10 MHz of 
bandwidth centered on 151.5 MHz. 

Having formed the data vectors, the next step in Equa¬ 
tion (12) is to weight the data by their inverse covari¬ 
ance. To do so, we of course require the covariance ma¬ 
trix C, which by definition, is the ensemble average of 
xxt, namely C = (xx^). Unfortunately, in our case the 
covariance is difficult to model from first principles, and 
we must resort to an empirically estimated C. We make 
this estimation by taking the time average of the quantity 
xx t over 8.5 hr of LST, estimating a different covariance 
matrix for each baseline and for odd versus even days. 
While an empirical determination of the covariance is 
advantageous in that it captures features that are diffi¬ 
cult to model from first principles, it carries the risk of 
cosmological signal loss (Switzer & Liu 2014). We will 
discuss and quantify this signal loss in Section 5.3. 

To gain some intuition for the action of C -1 on our 
data, let us examine the combination 

z r< = (C ri ) _1 x™ (21) 


for select baselines. This is a crucial step in the analysis 
since it suppresses coherent frequency structures (such as 
those that might arise from residual foregrounds). Note 
that the inverse covariance weighting employed here dif¬ 
fers from that in P14, in that P14 modeled and included 
covariances between different baselines, whereas in our 
current treatment we only consider covariances between 
different frequency channels. Figure 13 compares the 
effects of applying the inverse covariance matrix to a 
data vector that contains foregrounds (and thus contains 
highly correlated frequency structures) to one in which 
foregrounds have been suppressed by the wideband de¬ 
lay filter described in Section 3.3. In the figure, the top 


row corresponds to the data vector x rz for three selected 
baselines in the form of a waterfall plot of visibilities, 
with frequency on the horizontal axis and time on the 
vertical axis. The middle section shows the empirical es¬ 
timate of the covariance by taking the outer product of x 
with itself and averaging over the time axis. Finally, the 
last row shows the results of inverse covariance weight¬ 
ing the data, namely z ri . In every row, the foreground- 
dominated data are shown in the left half of the figure, 
while the foreground-suppressed data are shown in the 
right half. 

Consider the foreground-dominated x rz in Figure 13, 
and their corresponding covariance matrices. The 
strongest modes that are present in the data are the 
eigenmodes of the covariance matrix with the largest 
eigenvalues. Figure 14 shows the full eigenvalue spectrum 
and the four strongest eigenmodes. For the foreground- 
dominated data, one sees that the eigenvalue spectrum 
is dominated by the first few modes, and the correspond¬ 
ing eigenmodes are rather smooth, highly suggestive of 
smooth spectrum foreground sources. The application 
of the inverse covariance weighting down-weights these 
eigenmodes, revealing waterfall plots in the bottom row 
of Figure 13 that look more noise-dominated. With the 
foreground-suppressed portion (right half) of Figure 13, 
the initial x rz vectors already appear noise dominated 
(which is corroborated by the relatively noisy form of 
the eigenvalue spectra in Figure 14). The final z ri vectors 
remain noise-like, although some smooth structure (per¬ 
haps from residual foregrounds) has still been removed, 
and finer scale noise has been up-weighted. 

With intuition established for the behavior of C _1 , we 
may group our identical baselines into five different sets 
and average together z ri vectors for baselines within the 
same set. That is, we form 

* r A = y^(C ri ) _1 x ri , (22) 

ieA 

where A ranges from 1 to 5 and indexes the baseline set. 
At this point, we have 10 weighted data vectors z (5 
baseline sets, each of which has an even day and odd day 
version) for every LST-binned time-step. As discussed 
in the previous section, instrumental noise bias may be 
avoided by forming cross-power spectra rather than auto¬ 
power spectra. Generalizing Equation (13) to our present 
case where we have 10 different data vectors, we have 

q«: = Yl z ?Q« z s> ( 23 ) 

A , B , r,s 

r^s,A^B 

so that auto-power contributions from identical base¬ 
line groups or identical even/odd indices never appear. 
Residual foreground bias will remain in Equation (23), 
but in order to avoid possible signal loss from an overly 
aggressive foreground bias removal scheme, we conserva¬ 
tively allow the foreground bias to remain. Since fore¬ 
ground power will necessarily be positive, residual fore¬ 
grounds will only serve to raise our final upper limits. 

In order to implement Equation (23), it is necessary to 
derive a form for Q a = dC/dp a . To do so, we follow the 
delay spectrum technique of PI2a, where it was shown 
that 

< 24 > 
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Fig. 13. — Visibilities before (top row) and after (bottom row) inverse covariance weighting. Signal covariance (middle row) is esti¬ 
mated empirically, averaging over LST. The three left/right columns show visibilities from three different baselines in a redundant group 
before/after delay filtering, respectively. 


where Vi(t,r) is the delay transform of baseline visibil¬ 
ities given by Equation (8), X and Y are the constants 
that convert from angles and frequency to the co-moving 
coordinate, respectively, ft is the power squared beam 
(see Appendix B of P14), B is the bandwidth, A is the 
spectral wavelength, and ks is Boltzmann’s constant. 
This suggests that in order to estimate the power spec¬ 
trum from visibilities, one only needs to Fourier trans¬ 
form along the frequency axis (converting the spectrum 
into a delay spectrum) before squaring and multiplying 
by a scalar. Thus, the role of Q a in Equation (23) is 
to perform a frequency Fourier transform on each copy 
of z. It is therefore a separable matrix of the form 
Q a = m a m^, where m a is a complex sinusoid of a spe¬ 
cific frequency corresponding to delay mode a. We may 
thus write 

4* = E z ? m « m i z s- ( 25 ) 

A,B,r,s 

r^s,A^B 

With an explicit form for Q a , one now also has the nec¬ 


essary ingredients to compute the Fisher matrix using 
Equation (17). 

Having computed the q a s, we group our results into 
a vector q. This vector of unnormalized bandpowers is 
then normalized to form our final estimates of the power 
spectrum p. As noted above, the normalization occurs 
by the M matrix in Equation (14), and can be any ma¬ 
trix of our desire. Even though the choices of the nor¬ 
malization matrices described above have certain good 
properties, e.g. small error bars or no leakage, we opt 
for a different choice of window function, as follows. We 
first reorder the elements in q (and therefore in F, M, 
and p for consistency) so that the k -modes are listed in 
ascending order, from low k to high fc, with the excep¬ 
tion that we place the highest k bin third after the lowest 
two k bins. (The reason for this exception will be made 
apparent shortly). We then take the Cholesky decompo¬ 
sition of the Fisher matrix, such that F = LL^, where 
L is a lower triangular matrix. Following that, we pick 
M = DL -1 , where D is a diagonal matrix chosen to ad- 
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Eigenmode 


Frequency [MHz] 


Fig. 14.— Eigenvalue spectrum of covariance matrices (left) 
empirically estimated from visibilities before (blue) and after 
(green) delay filtering. The four strongest eigenmodes of the fil¬ 
tered/unfiltered data are plotted on the top/bottom panels on the 
right, respectively. 

here to the normalization constraint that W = MF has 
rows that sum to unity. In this case, the window func¬ 
tion matrix becomes, W = DLA This means that W is 
upper triangular, and with our ordering scheme, has the 
consequence of allowing power to leak from high to low 
k , but not vice versa. Since our k axis is (to a good ap¬ 
proximation) proportional to the delay axis, foregrounds 
preferentially appear at low k because their spectra are 
smooth. Reducing leakage from low k to high k thus 
mitigates leakage of foregrounds into the cleaner, more 
noise-dominated regions. Additionally, our placement of 
the highest k bin as the third element in our reordering 
of p prevents leakage from this edge bin that will con¬ 
tain aliased power. Figure 15 shows the resulting window 
functions. 

Our choice of normalization matrix also has the attrac¬ 
tive property of eliminating error correlations between 
bandpower estimates. Using Equation (19), we have that 

S = DL^LLT-tD = D 2 . (26) 

The error covariance matrix on the bandpowers is thus 
diagonal, which implies that our final data points are 
uncorrelated with one another. This stands in contrast 
to the power-spectrum estimator used in P14, where the 
Blackmann-Harris taper function induced correlated er¬ 
rors between neighboring data points. 

5.3. Covariance Matrix and Signal Loss 

We now discuss some of the subtleties associated with 
empirically estimating the covariance matrix from the 
data. Again, the covariance matrix is defined as the en¬ 
semble average of the outer product of a vector with it¬ 
self, i.e., 

C = (xx f ), (27) 

where x is the data (column) vector used in the analysis. 
In our analysis, we do not have a priori knowledge of 
the covariance matrix, and thus we must resort to em¬ 
pirical estimates (Dillon et al. 2015). As we have alluded 




Fig. 15.— The window function matrix W, as defined in Equa¬ 
tion (15). The i th row corresponds to the window function used in 
the estimate of the power spectrum for the i th k- mode. Color scale 
indicates log 10 W. The inset plot illustrates the window function 
along the dashed line in the upper panel. As described in Sec¬ 
tion 5.2, M in Equation (16) has been chosen so that each window 
function peaks at the waveband while achieving a high degree of 
isolation from at lower fc-modes that are likely to be biased by 
foregrounds. 

to above, we replace the ensemble average with a time 
average that runs from 0 to 8:30 LST hours. 

Since the OQE method for power spectrum estimation 
requires the inversion of C, it is crucial that our empir¬ 
ically estimated covariance be a full rank matrix. With 
our data consisting of visibilities over 20 frequency chan¬ 
nels, the covariance matrix is a 20 x 20 matrix. Thus, 
a necessary condition for our estimate to be full rank is 
for there to be at least 20 independent time samples in 
our average. As noted in Section 3.5 the fringe-rate filter 
used corresponds to averaging time samples for 31 min¬ 
utes. Over the LST range used in this analysis, this cor¬ 
responds to roughly 20 statistically independent modes 
in our data after fringe-rate filtering. We therefore have 
just enough samples for our empirical estimate, and in 
practice, our covariance matrices are invertible and allow 
OQE techniques to be implemented. 

Another potential problem that occurs from empiri¬ 
cally estimating covariances is that it leads to models 
of the covariance matrix that over-fit the noise. In this 
scenario, the covariance matrix tells us that there may 
be modes in the data that should be down-weighted, for 
example, but if the empirical covariance estimates are 
dominated by noise, these may just be random fluctua¬ 
tions that need not be down-weighted. Said differently, 
the weighting of the data by the inverse covariance is 
heavily influenced by the noise in the estimate of the co- 
variance matrix and thus has the ability to down-weight 
valid high-variance samples. Over-fitting the noise in this 
manner carries with it the possibility of cosmological sig¬ 
nal loss. This seems to contradict the conventionally 
recognized feature of OQEs as lossless estimators of the 
power spectrum (Tegmark 1997). However, the standard 
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Fig. 16. — Recovered power spectrum signal as a function of 
injected signal amplitude. Shaded regions indicate the range in 
measured amplitude of power spectrum modes in Figure 18. Er¬ 
ror bars indicate 95% confidence intervals as determined from the 
Monte Carlo simulations described in Section 5.3. Because the re¬ 
covered signal amplitude is a monotonic function of the injected 
signal amplitude, it is possible to invert the effects of signal loss in 
the measured power spectrum values to infer the true signal ampli¬ 
tude on the sky. Over the range of powers measured, the maximum 
correction factor Pin/Pout is less than 1.02 at 97.5% confidence. 
The transition to significantly higher correction factors at larger 
signal amplitudes occurs as the injected signal dominates over the 
foreground modes present in estimates of the data covariance. 

proofs of this property assume that statistics such as C 
are known a priori , which is an assumption that we are 
violating with our empirical estimates. 

In order to deal with possible signal loss, we perform 
simulations of our analysis pipeline, deriving correction 
factors that must be applied to our final constraints. We 
simulate visibilities for Gaussian temperature field with 
a flat amplitude in P(k) that rotates with the sky, which 
is fringe-rate filtered in the same way as the data for 
our fiducial baselines. This signal is processed through 
our pipeline, and the output power spectrum compared 
to the input power spectrum, for various levels of input 
signal amplitude. We repeat this for 40 sky realizations 
at each signal level. Figure 16 shows the resultant sig¬ 
nal loss associated with estimating the covariance matrix 
from the data. Error bars were obtained through boot¬ 
strapping. 

As a function of the increasing input amplitude of the 
simulated power spectra, we find that the ratio of out¬ 
put power to input power decreases, which we interpret 
as signal loss through the use of our empirical OQE of 
the power spectrum. However, since the transfer func¬ 
tion through this analysis is an invertible function, we 
can correct for the transfer by using the output value 
to infer a signal loss that is then divided out to obtain 
the original input signal level. In Figure 16, we see that 
deviations from unity signal transfer begin at power spec¬ 
trum amplitudes of 10 7 mK 2 (h -1 Mpc) 3 . For the range of 
output power spectrum amplitudes in our final estimate 
of the 21cm power spectrum (Figure 18), we show that 


signal loss is < 2% at 95% confidence. 

TABLE 1 

SIGNAL LOSS VERSUS ANALYSIS STAGE 


Analysis Stage Typical Loss Maximum Loss 


Bandpass Calibration 

< 2 x 10‘ 

“ 7 % 

3.0% 

Delay Filtering 

1.5 x 10“ 

3 % 

4.8% 

Fringe-rate Filtering 

28.1% 


28.1% 

Quadratic Estimator 

< 2.0% 


89.0% 

Median of Modes 

30.7% 


30.7% 


As shown in Table 1, the signal loss we characterize for 
quadratic estimation of the power spectrum band pow¬ 
ers is tabulated along with the signal loss associated with 
each other potentially lossy analysis stage (see Figure 3). 
We correct for the signal loss in each stage by multiply¬ 
ing the final power spectrum results by the typical loss 
for each stage, except for modes within the horizon limit 
and immediately adjacent to the horizon limit, where we 
apply the maximum signal loss correction to be conser¬ 
vative. 


5.4. Bootstrapped Averaging and Errors 

When estimating our power spectra via OQEs, we gen¬ 
erate multiple samples of the power spectrum in order to 
apply the bootstrap method to calculate our error bars. 
In detail, the power spectrum estimation scheme pro¬ 
posed above requires averaging at several points in the 
pipeline: 

1. Visibilities are averaged into five baseline groups 
after inverse covariance weighting (see Equation 
( 22 )) 

2. Power spectrum estimates from each of the three 
redundant baseline types (described in Section 2) 
are averaged together. 

3. Power spectrum estimates from each LST are av¬ 
eraged together. 

With the bootstrapping technique, we do not directly 
perform these averages. Instead, one draws random sam¬ 
ples within the three-dimensional parameter space spec¬ 
ified above, with replacement, until one has as many 
random samples as there are total number of parame¬ 
ter space points. These random samples are then propa¬ 
gated through the power spectrum pipeline and averaged 
together as though they were the original data. This 
forms a single estimate (a “bootstrap”) of P(k). Re¬ 
peating random draws allows one to quantify the inher¬ 
ent scatter—and hence the error bars—in our estimate 
of P(k). When plotting A 2 {k) = k 3 P(k)/27T 2 instead of 
P(k), we bin power falling in +k and —fc, and so we addi¬ 
tionally randomize the inclusion of positive and negative 
k bins. 

We compute a total of 400 bootstraps. In combining 
independent samples for our final power spectrum esti¬ 
mate, we elect to use the median, rather than the mean, 
of the samples. One can see the behavior of both statis¬ 
tics in Figure 17, where we show how the absolute value 
of A 2 (k) integrates down as more independent samples 
are included in the mean and median. In this plot, one 
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Fig. 17.— Absolute value of the cumulative mean (left) and median (right), as a function of number of modes of the power spectrum 
band power for fey modes ranging from —0.49 (red) to 0.44h Mpc -1 (violet). Here, modes are defined as samples from different redundant 
baseline groups and LSTs. This Allen variance plot shows modes averaging down as the square root of number of modes combined until a 
signal floor is reached. The difference in behavior between the mean and median is an indication of outliers in the distribution of values, 
likely as a result of foreground contamination. We use the median in the estimation of the power spectrum in Figure 18, along with a 
correction factor compensating for the difference between the mean and median in estimating variance. 


can see modes integrating down consistent with a noise- 
dominated power spectrum until they bottom out on a 
signal. In the noise-dominated regime, the mean and the 
median behave similarly. However, we see that the me¬ 
dian routinely continues to integrate down as noise for 
longer. This is an indication that the mean is skewed 
by outlier modes, suggesting variations beyond thermal 
noise. The magnitude of the difference is also not consis¬ 
tent with the Rayleigh distribution expected of a cosmo¬ 
logical power spectrum limited by cosmic variance. For 
a Rayleigh distribution, the median is In 2 ^ 0.69 times 
the mean. Instead, we interpret the discrepancy as a 
sign of contributions from foregrounds, which are neither 
isotropic nor Gaussian distributed. Since median pro¬ 
vides better rejection of outliers in the distribution that 
might arise from residual foreground power, we choose to 
use the median statistic to combine measurements across 
multiple modes. As listed in Table 1, we apply a l/ln2 
correction factor to our power spectrum estimates to in¬ 
fer the mean from the median of a Rayleigh distribution. 

6. RESULTS 

6.1. Power Spectrum Constraints 

To summarize the previous section, we follow the power 
spectrum analysis procedure outlined in Section 5.2, we 
incoherently combine independent power spectrum mea¬ 
surements made at different times and with different 
baseline groups using the median statistic. As described 
in Section 5.4, we bootstrap over all of these independent 
measurements, as well as over the selection of baselines 
included in the power spectrum analysis for each base¬ 
line group, in order to estimate the error bars on the 
spherically averaged power spectrum P(fc), where posi¬ 
tive and negative fey measurements are kept separate for 
diagnostic purposes. In the estimation of the dimension¬ 
less power spectrum A 2 (k) = k 3 P(k)/ 27r 2 , the folding of 


±fc|| is handled along with the rest of the bootstrapping 
over independent modes. Finally, the measured values 
for P(k) and A 2 {k) are corrected for signal loss through 
all stages of analysis, as summarized in Table 1. 

The final results are plotted in Figure 18. For the 
first two modes outside of the horizon where A 2 {k) is 
measured, we have clear detections. We attribute these 
to foreground leakage from inside the horizon related to 
the convolution kernels in Equation (8) (either from the 
chromaticity of the antenna response, or from the inher¬ 
ent spectrum of the foregrounds themselves). Somewhat 
more difficult to interpret are the 2.4cr excess at k « 
0.30h Mpc -1 and the 2.9a excess at k « 0.44h Mpc -1 . 
Having two such outliers is unlikely to be chance. 

In examining the effects on the power spectrum of 
omitting various stages of analysis (see Figure 19), we 
see a pronounced excess in the green curve corresponding 
to the omission of crosstalk removal in fringe-rate filter¬ 
ing. While the signal is heavily attenuated in the filtering 
step, it remains a possibility that the remaining detec¬ 
tions are associated with instrumental crosstalk. We do 
note, however, that the qualitative shape of the excess 
in the crosstalk-removed data does not appear to match 
that of the crosstalk-containing data. 

Another likely possibility is that the signal might be 
associated with foregrounds. Foregrounds, which are not 
generally isotropically distributed on the sky, are likely 
to be affected by the spatial filtering associated with 
fringe-rate filtering, whereas a statistically isotropic sig¬ 
nal is not. Indeed, we see that excesses in many modes 
measured with using the P14-stype time-domain filtering 
(blue in Figure 19) decrease significantly using the im¬ 
proved fringe-rate filter. As discussed in Parsons et al. 
(2015), the normalization applied to for fringe-rate 
filtering correctly compensates for the effect of this fil¬ 
tering on power-spectral measurements of a statistically 
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Fig. 18.— Measured power spectrum (black dots with 2cr error bars) at z = 8.4 resulting from a 135 day observation with PAPER-64. 
The dashed vertical lines at 0.6h Mpc -1 show the bounds of the delay filter described in Section 3.3. The predicted 2 a upper limit in 
the absence of the a celestial signal is shown in dashed cyan, assuming Tsys = 500AT. The triangles indicate 2 cr upper limits from GMRT 
(Paciga et al. 2011) (yellow) at z = 8.6, MWA (Dillon et al. 2014) at z = 9.5 (magenta), and the previous PAPER upper limit (P14) at 
z = 7.7 (green). The magenta curve shows a predicted model 21cm power spectrum at 50% ionization (Lidz et al. 2008). 



Fig. 19.— Diagnostic power spectra in the style of Figure 18 illustrating the impact of various analysis stages. The blue power spectrum 
uses the P14 fringe-rate filter combined with crosstalk removal. Green illustrates the result using the improved fringe-rate filter, but without 
crosstalk removal. A power spectrum derived without the application of OMNICAL is shown in orange. Black includes improved fringe-rate 
filtering, crosstalk removal, and OMNICAL calibration; it is the same power spectrum shown in Figure 18. 
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Fig. 20. — Posterior distribution of power spectrum amplitude 
for a flat A 2 (k) power spectrum over 0.15 < k < 0.5 h Mpc -1 
(solid black), assuming Gaussian error bars. The blue and orange 
vertical lines correspond to the la and 2a bounds, respectively. 

isotropic Gaussian sky signal. We can surmise from 
any significant change in amplitude of the excess under 
fringe-rate filtering that it arises from emission that vi¬ 
olates these assumptions. We conclude, therefore, that 
this excess is unlikely to be cosmic reionization, and is 
more likely the result of non-Gaussian foregrounds. As 
discussed earlier, one possible culprit is polarization leak¬ 
age (Jelic et al. 2010, 2014; Moore et al. 2013), although 
further work will be necessary to confirm this. The inter¬ 
pretation of the signal as polarization leakage is, however, 
rather high to be consistent with recent measurements 
in Stokes Q presented in Moore et al. (2015), where the 
leakage is constrained to be < 100 mK^ for all k. 

That the excesses at k ~ 0.30 and 0.44h Mpc -1 are 
relatively unaffected by the filtering could be an indi¬ 
cation that they are more isotropically distributed, but 
more likely, it may mean that the simply arise closer to 
the center of the primary beam where they are down¬ 
weighted less. Both excesses appear to be significantly 
affected by omitting OMNICAL calibration (orange in 
Figure 19). This could be interpreted as indicating the 
excess is a modulation induced by frequency structure 
in the calibration solution. However, OMNICAL is con¬ 
strained to prohibit structure common to all baselines, 
so a more likely interpretation is that this faint feature 
decorrelates without the precision of redundant calibra¬ 
tion. To determine the nature of these particular ex¬ 
cesses, further work will be necessary. 

In order to aggregate the information presented in the 
power spectrum into a single upper limit, we fit a flat 
A 2 {k) model to measurements in the range 0.15 < k < 
0.5 h Mpc -1 . We use a uniform prior of amplitudes be¬ 
tween -5000 and 5000 mK 2 , and assume measurement 
errors are Gaussian. Figure 20 shows the posterior dis¬ 
tribution of the fit. From this distribution, we determine 
a mean of (18.9 mK) 2 and a 2a upper limit of (22.4 mK) 2 
. The measured mean is inconsistent with zero at the 
4.7 a level, indicating that we are detecting a clear power 
spectrum excess at k > 0.15h Mpc -1 . 

We suspect that the excess in our measured power 


spectrum is likely caused by crosstalk and foregrounds. 
We therefore suggest ignoring the lower bound on the 
power spectrum amplitude as not being of relevance for 
the cosmological signal. On the other hand, since fore¬ 
ground power is necessarily positive, the 2cr upper limit 
of (22.4 mK) 2 at z = 8.4, continues to serve as a con¬ 
servative upper limit. This significantly improves over 
the previous best upper limit of (41 mK) 2 at z = 7.7 
reported in P14. As we show below and in greater detail 
in Pober et al. (2015), this limit begins to have implica¬ 
tions for the heating of the IGM prior to the completion 
of reionization. 


6.2. Spin Temperature Constraints 


In this section, we examine the implication of the mea¬ 
sured upper limits on 21cm emission in Figure 18 on the 
spin temperature of the 21cm line at z = 8.4. In a forth¬ 
coming paper (Pober et al. 2015), we conduct a thorough 
analysis of the constraints that can be put on the IGM 
using a simulation-based framework. As a complement 
to that more thorough analysis, we focus here on a sim¬ 
pler parameterization of the shape of the 21cm power 
spectrum signal. 

The brightness temperature of the 21cm signal, 5T b , 
arising from the contrast between the cosmic microwave 
background, T 7 , and the spin temperature, T s , is given 
by 


ST b 


r -r 7 

1 + Z 


(i-O 


r -r 7 

1 + Z 


T, 


(28) 


where temperatures are implicitly a function of redshift 
z, and the approximation holds for low optical depth, r. 
The optical depth is given by (Zaldarriaga et al. 2004) 


3c ]_q t~i i 
Wkv$T s H(z) 


(29) 


where Aiq is the Einstein A coefficient for the 21cm tran¬ 
sition, riHi is the density of the neutral hydrogen, H(z) 
is the Hubble constant, xhi is the neutral fraction of hy¬ 
drogen, 5 is the local baryon overdensity, v$ is the rest 
frequency of the 21cm transition, and the remainder are 
the usual constants. Plugging in the cosmological pa¬ 
rameters from Planck Collaboration et al. (2015), we get 


5T b ^T 0 x ni {l + 5)^ (30) 


where £ = 1 — T 7 /T s and To = 26.7mK^(1 + z)/ 10. 

If the spin temperature is larger than T 7 , we get the 
21cm signal in emission with respect to the CMB, and 
£ ~ 1. However, if T s is less than T 7 , 5T b is negative and 
£ can potentially become large. 

As in P14, we consider a “weak heating” scenario 
in which T s is coupled to the gas temperature via the 
Wouthuysen-Field effect (Wouthuysen 1952; Field 1958; 
Hirata 2006), but little heating has taken place prior to 
reionization, so that T s < T 7 . In this scenario, because 
we have assumed little heating, we can approximate £ as 
having negligible spatial dependence, and therefore Tq £ 2 
becomes a simple multiplicative scalar to the 21cm power 
spectrum: 

Al 1 (k)=T^e(z)AUk), (31) 

where A 2 (&) is the dimensionless HI power spectrum. 

As shown in P14, the maximum value of the prefactor 
in Equation (31) is given by a no-heating scenario where 
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Fig. 21.— Constraints on the 21cm spin temperature at z = 8.4, assuming the patchy reionization model in Equations (31) and (33), 
which hold in the limit that T s < Tcmb- 


the spin temperature follows the kinetic gas temperature, 
which is held in equilibrium with the CMB via Compton 
scattering until z^ e c ~ 150 (Furlanetto et al. 2006) and 
then cools adiabatically as (1 + z) 2 . In this case, £ is 
given by 


1 + ^dec ^ 150 

1 + z ~ 1 + z' 


(32) 


At z = 8.4, this corresponds to a minimum bound on the 
spin temperature of T s > 1.5 K. 

We can now flip this argument around and, for a mea¬ 
sured upper bound on A^ 1 (k), we can use models for 
A f(k) in Equation (31) to place a bound on T s . We con¬ 
sider a class of “patchy” reionization models (P12a;P14) 
which approximates the ionization power spectrum as 
flat between minimum and maximum bubble sizes, k m i n 
and k max , respectively: 

A f(k) = (xrn - Xhi )/In (fcmax/fcmin)- (33) 
For combinations of k m i n and k max , we determine the 
minimum spin temperature implied by the 2a 21 cm 
power spectrum upper limits shown in Figure 18. Figure 
21 shows the results of these bounds for neutral frac¬ 
tions of xhi = 0.1, 0.3, 0.5, 0.7, and 0.9. In almost all 
cases (excepting xm = 0.1, 0.9 for fc m i n < 0.1ft- Mpc -1 ), 
we find that T s > 3 K, indicating that our measurements 
are inconsistent with the spin temperature being coupled 
to a kinetic temperature governed strictly by adiabatic 
expansion. 

Our results become more interesting in the range of 
&min ~ 0.1 and k max ~ 30 representative of fiducial sim¬ 
ulations (Zahn et al. 2007; Lidz et al. 2008). For neutral 
fractions of 0.3, 0.5, and 0.7, we find that T s > 4K. 
Pober et al. (2015) improves on these results by using 
a simulation-based framework, rather than relying on 
coarse parametrizations of the power spectrum shape. 
They compare the limits they find to the amount of heat¬ 
ing possible given the currently observed star formation 
rates in high-redshift galaxy populations (Bouwens et al. 
2014; McLeod et al. 2015) and assumptions about the re¬ 
lationship between star formation rates and X-ray lumi¬ 
nosities (Furlanetto et al. 2006; Pritchard & Loeb 2008; 
Fialkov et al. 2014). Assuming the midpoint of reioniza¬ 
tion lies close to z = 8.4 (a reasonable assumption given 


that Planck Collaboration et al. 2015 suggests a mid¬ 
point of z = 8.8), both the bounds found in this paper 
and Pober et al. (2015) show evidence for heating that 
places constraints on the possible values for the star for¬ 
mation rate/X-ray luminosity correlation given certain 
models of the star formation rate density redshift evo¬ 
lution. We refer the reader to Pober et al. (2015) for a 
detailed examination of these results. 

7. DISCUSSION 

The improvement in our results over those in P14 are 
the result of four major advances: 

1. the expansion of PAPER to 64 antennas doubled 
our instrument’s power spectrum sensitivity, 

2. using OMNICAL for redundant calibration signif¬ 
icantly improved the clustering of measurements 
over the previous implementation of LOGCAL used 
in P14, 

3. fringe-rate filtering further improved power spec¬ 
trum sensitivity by ^50% and suppressed system- 
atics associated with foregrounds low in the pri¬ 
mary beam, and 

4. moving from a lossless quadratic estimator target¬ 
ing difference modes in redundant measurements 
to an OQE (with carefully calibrated signal loss) 
significantly reduced contamination from residual 
foregrounds. 

Figure 19 illustrates the effect of some of these advances 
on the final power spectrum. Other important advances 
include the use of the median statistic to reduce the im¬ 
pact of non-Gaussian outliers in power-spectral measure¬ 
ments, and the use of a Cholesky decomposition of the 
Fisher information matrix to help reduce leakage from 
highly contaminated modes within the wedge. 

These new techniques and improvements to calibration 
have reduced the measured bias in nearly all wavebands 
by an order of magnitude or more. The use of OMNICAL 
to accurately calibrate the relative complex gains of the 
antennas has shown to be a major improvement to the 
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data-reduction pipeline. The accuracy and improvement 
of this calibration brings redundant baselines into im¬ 
pressive agreement with one another (see Figures 4 and 
10), and provides important diagnostic information for 
monitoring the health of the array, flagging RFI events, 
and otherwise assessing data quality. Fringe-rate filter¬ 
ing, which is described in greater depth in (Parsons et al. 
2015), is also proving to be a flexible and powerful tool 
for controlling direction-dependent gains and improving 
sensitivity. 

As sensitivity improves, it will be possible to determine 
more accurately than Moore et al. (2015) what the ac¬ 
tual level of polarized emission, and thus leakage, may 
be. Independent fringe-rate filtering of the XX and YY 
polarizations prior to summation has the potential to 
better match these polarization beams and further sup¬ 
press the leakage signal if the polarized signal turns out 
to be significant. 

The end result is a major step forward, both for PA¬ 
PER and for the field of 21cm cosmology. While we have 
not yet made a detection of the 21cm cosmological sig¬ 
nal, our limits are now within the range of some of the 
brighter models. As discussed in Pober et al. (2015), an¬ 
other order-of-magnitude improvement in sensitivity will 
make 21cm measurements highly constraining. 

8. CONCLUSIONS 

We present new upper limits on the 21 cm reionization 
power spectrum at z = 8.4, showing a factor of ^4 im¬ 
provement over the previous best result (P14). We find 
a 2a upper limit of (22.4 mK) 2 by fitting a flat power 
spectrum in a k range from 0.15 < k < 0.5 h Mpc -1 to 
the dimensionless power spectrum, A 2 (fc), measured by 
the PAPER instrument. We coarsely show that these 
upper limits imply a minimum spin temperature for hy¬ 
drogen in the IGM. Although these limits are dependent 
on the model chosen for the power spectrum, we use a 
patchy reionization model to show that limits of T s > 4 K 
are fairly generic for models with ionization fractions be¬ 
tween 0.3 and 0.7. A more detailed analysis of the im¬ 
plied constraints on spin temperature using semi-analytic 
reionization/heating simulations is presented in a forth¬ 
coming paper (Pober et al. 2015). 

The power spectrum results that we present continue 
to be based on the delay-spectrum approach to fore¬ 
ground avoidance presented in PI2b and first applied in 
P14. The application of a delay filter over a wide band¬ 
width continues to be one of the most powerful tech¬ 
niques yet demonstrated for managing bright smooth- 
spectrum foregrounds. In this paper, we extend the 
analysis in P14 with improved fringe-rate filtering, im¬ 
proved redundant calibration with OMNICAL, and with 
an OQE that, while not perfectly lossless, is more adept 
at down-weighting residual foregrounds. The combined 
effect of these improvements leaves a power-spectral mea¬ 
surement that is not consistent with zero at the 4.7a- 
level, which we expect is a result of contamination from 
crosstalk and foregrounds. With the expansion of PA¬ 
PER to 64 antennas, the extended 135 day observing 
campaign, and the added sensitivity benefits of fringe- 
rate filtering, combined with the optimization of antenna 
positions in PAPER for highly redundant measurements, 
this thermal noise limit is beginning to enter the realm 


of constraining realistic models of reionization. 

Forthcoming from PAPER will be two seasons of ob¬ 
servation with a 128-element array. Following the same 
analysis as presented here, that data set is expected to 
improve over the PAPER-64 sensitivity by a factor of 
~4 (in mK 2 ), with the potential for another boost to 
sensitivity should the new 16-m baselines provided in 
the PAPER-128 array configuration prove to be usable. 
There also remains the potential for further improve¬ 
ments to sensitivity through the use of longer baselines, 
if foregrounds can be managed effectively. As has been 
done recently for PAPER-32 (Jacobs et al. 2015; Moore 
et al. 2015), future work will also extend PAPER-64 anal¬ 
ysis to a range of redshifts and examine the power spec¬ 
trum of polarized emission. 

With recent breakthroughs in foreground management, 
the sensitivity limitations of current experiments are be¬ 
coming clear. Although collecting area is vital, as dis¬ 
cussed in Pober et al. (2014), the impact of collecting 
area depends critically on the interplay of array config¬ 
uration with foregrounds. Despite a large spread in col¬ 
lecting areas between PAPER, the MWA, and LOFAR, 
in the limit that foreground avoidance is the only viable 
strategy, these arrays all deliver, at best, comparable low- 
significance detections of fiducial models of reionization. 
To move beyond simple detection, next-generation in¬ 
struments must deliver much more collecting area with 
very compact arrays. 

The Hydrogen Epoch of Reionization Array (HERA) 
and the low frequency Square Kilometre Array (SKA- 
Low) are next generation experiments that aim to make 
significant detections of the 21 cm power spectrum and 
begin characterizing it. SKA-Low has secured pre¬ 
construction funding for a facility in western Australia. 
HERA was recently granted funding for its first phase 
under the National Science Foundation’s Mid-Scale In¬ 
novations Program. HERA uses a close packing of 14-m 
diameter dishes designed to minimize the width of the 
delay-space kernel A r in Equation (8). Sensitivity fore¬ 
casts for a 331-element HERA array and SKA-Low show 
that they can deliver detections of the 21cm reionization 
signal at a significance of 39a and 21a, respectively, using 
the same the conservative foreground avoidance strategy 
employed in this paper (Pober et al. 2014). HERA is the 
natural successor to PAPER, combining a proven exper¬ 
imental strategy with the sensitivity to deliver results 
that will be truly transformative for understanding of 
our cosmic dawn. 
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